AITopics | regularization method

In this paper, we study norm-based regularization methods for neural networks. We compare existing penalization approaches and introduce two regularization strategies that extend classical ridge- and lasso-type penalties to neural network models. The first strategy modifies weight decay by incorporating the covariance structure of the input features into a ridge-type $\ell_2$ penalty, allowing regularization to account for feature dependence. The second combines an $\ell_1$ sparsity penalty with covariance-aware $\ell_2$ regularization, producing neural network weights that are both sparse and structurally informed. Monte Carlo simulations are used to evaluate these methods under different data-generating settings, followed by two real-data applications on building cooling-load prediction and leukemia cell-type classification from high-dimensional gene expression data. Across simulated and real-data examples, the proposed regularizers improve predictive performance on unseen data and provide more effective complexity control than standard norm-based penalties, particularly when features are correlated or high-dimensional.

artificial intelligence, cov, machine learning, (17 more...)

arXiv.org Machine Learning

2605.00171

Country: Europe > Sweden (0.14)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Rethinking Calibration of Deep Neural Networks: Do Not Be Afraid of Overconfidence

Neural Information Processing SystemsApr-26-2026, 04:51:45 GMT

Capturing accurate uncertainty quantification of the predictions from deep neural networks is important in many real-world decision-making applications. A reliable predictor is expected to be accurate when it is confident about its predictions and indicate high uncertainty when it is likely to be inaccurate. However, modern neural networks have been found to be poorly calibrated, primarily in the direction of overconfidence. In recent years, there is a surge of research on model calibration by leveraging implicit or explicit regularization techniques during training, which achieve well calibration performance by avoiding overconfident outputs. In our study, we empirically found that despite the predictions obtained from these regularized models are better calibrated, they suffer from not being as calibratable, namely, it is harder to further calibrate these predictions with post-hoc calibration methods like temperature scaling and histogram binning. We conduct a series of empirical studies showing that overconfidence may not hurt final calibration performance if post-hoc calibration is allowed, rather, the penalty of confident outputs will compress the room of potential improvement in post-hoc calibration phase. Our experimental findings point out a new direction to improve calibration of DNNs by considering main training and post-hoc calibration as a unified framework.

artificial intelligence, calibration, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.47)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Add feedback

15f4cefb0e143c7ad9d40e879b0a9d0c-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 06:02:43 GMT

artificial intelligence, loss function, machine learning, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.15)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

05311655a15b75fab86956663e1819cd-Paper.pdf

Neural Information Processing SystemsApr-24-2026, 11:29:35 GMT

artificial intelligence, machine learning, manifold, (14 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.79)

Add feedback

06b71ad997f7e3e4b2e2f2ea12e5a759-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 09:48:45 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America (0.46)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(4 more...)

Add feedback

Regularizing Attention Scores with Bootstrapping

Chung, Neo Christopher, Laletin, Maxim

arXiv.org Machine LearningApr-3-2026

Vision transformers (ViT) rely on attention mechanism to weigh input features, and therefore attention scores have naturally been considered as explanations for its decision-making process. However, attention scores are almost always non-zero, resulting in noisy and diffused attention maps and limiting interpretability. Can we quantify uncertainty measures of attention scores and obtain regularized attention scores? To this end, we consider attention scores of ViT in a statistical framework where independent noise would lead to insignificant yet non-zero scores. Leveraging statistical learning techniques, we introduce the bootstrapping for attention scores which generates a baseline distribution of attention scores by resampling input features. Such a bootstrap distribution is then used to estimate significances and posterior probabilities of attention scores. In natural and medical images, the proposed \emph{Attention Regularization} approach demonstrates a straightforward removal of spurious attention arising from noise, drastically improving shrinkage and sparsity. Quantitative evaluations are conducted using both simulation and real-world datasets. Our study highlights bootstrapping as a practical regularization tool when using attention scores as explanations for ViT. Code available: https://github.com/ncchung/AttentionRegularization

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Machine Learning

2604.01339

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Africa > Middle East > Morocco > Tanger-Tetouan-Al Hoceima Region > Tangier (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.48)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Knowledge Graph Completion by Intermediate Variables Regularization

Neural Information Processing SystemsMar-22-2026, 11:06:20 GMT

Knowledge graph completion (KGC) can be framed as a 3-order binary tensor completion task. Tensor decomposition-based (TDB) models have demonstrated strong performance in KGC. In this paper, we provide a summary of existing TDB models and derive a general form for them, serving as a foundation for further exploration of TDB models. Despite the expressiveness of TDB models, they are prone to overfitting. Existing regularization methods merely minimize the norms of embeddings to regularize the model, leading to suboptimal performance.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.81)

Add feedback

eb1e78328c46506b46a4ac4a1e378b91-Paper.pdf

Neural Information Processing SystemsFeb-19-2026, 08:35:12 GMT

gradient descent, psgd, quantization, (12 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.05)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Knowledge Graph Completion by Intermediate Variables Regularization

Neural Information Processing SystemsFeb-18-2026, 02:20:53 GMT

Existing regularization methods merely minimize the norms of embeddings to regularize the model, leading to suboptimal performance.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Africa > Senegal > Kolda Region > Kolda (0.04)
Asia > China (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.41)

Add feedback

Well-tunedSimpleNetsExcelon TabularDatasets

Neural Information Processing SystemsFeb-11-2026, 03:39:46 GMT

Weempirically assess theimpact oftheseregularization cocktailsforMLPs ina large-scale empirical study comprising 40 tabular datasets and demonstrate that (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditionalMLmethods,suchasXGBoost.

artificial intelligence, deep learning, machine learning, (20 more...)

Neural Information Processing Systems

Country: